Overview

Dataset statistics

Number of variables: 12
Number of observations: 11127
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.0 MiB
Average record size in memory: 96.0 B

Variable types

Numeric: 6
Text: 5
Categorical: 1

Alerts

ratings_count is highly overall correlated with text_reviews_count [High correlation]
text_reviews_count is highly overall correlated with ratings_count [High correlation]
language_code is highly imbalanced (76.6%) [Imbalance]
isbn13 is highly skewed (γ1 = -21.07028799) [Skewed]
bookID has unique values [Unique]
isbn has unique values [Unique]
text_reviews_count has 625 (5.6%) zeros [Zeros]

Reproduction

Analysis started: 2024-03-03 14:58:51.657057
Analysis finished: 2024-03-03 14:58:58.137990
Duration: 6.48 seconds
Software version: ydata-profiling v4.6.5
Download configuration: config.json
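
A report of this kind can be regenerated with a few lines of Python. The sketch below is only an illustration: the function name `build_profile`, the report title, and the file paths in the usage comment are assumptions, and the configuration actually used for this run is the one in config.json, which is not reproduced here.

```python
from pathlib import Path


def build_profile(csv_path: str, html_path: str) -> Path:
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so this module can be loaded even where the
    # profiling libraries are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # Default settings are assumed; the original run may have used others.
    report = ProfileReport(df, title="Dataset profile")
    report.to_file(html_path)
    return Path(html_path)


# Example call (placeholder paths, not taken from the report):
# build_profile("books.csv", "report.html")
```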

Variables

bookID
Real number (ℝ)

UNIQUE 

Distinct: 11127
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21310.939
Minimum: 1
Maximum: 45641
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 87.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1800.3
Q1: 10287
Median: 20287
Q3: 32104.5
95-th percentile: 43066.5
Maximum: 45641
Range: 45640
Interquartile range (IQR): 21817.5

Descriptive statistics

Standard deviation: 13093.358
Coefficient of variation (CV): 0.61439611
Kurtosis: -1.1463568
Mean: 21310.939
Median Absolute Deviation (MAD): 10879
Skewness: 0.14405166
Sum: 2.3712682 × 10^8
Variance: 1.7143602 × 10^8
Monotonicity: Not monotonic
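
Several of the derived figures follow directly from the others, which gives a quick consistency check. The sketch below uses the bookID values reported above. The variable names are illustrative, and the tolerances only allow for the rounding visible in the report.

```python
import math

# Values copied from the bookID summary.
mean = 21310.939
std = 13093.358
variance = 1.7143602e8
q1, q3 = 10287, 32104.5
minimum, maximum = 1, 45641

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV    = {cv:.8f}")
print(f"IQR   = {iqr}")
print(f"Range = {value_range}")
print("variance matches std^2:", math.isclose(std ** 2, variance, rel_tol=1e-5))
```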
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
34889 1
 
< 0.1%
28532 1
 
< 0.1%
28510 1
 
< 0.1%
28511 1
 
< 0.1%
28514 1
 
< 0.1%
28522 1
 
< 0.1%
28524 1
 
< 0.1%
28529 1
 
< 0.1%
28530 1
 
< 0.1%
28531 1
 
< 0.1%
Other values (11117) 11117
99.9%
ValueCountFrequency (%)
1 1
< 0.1%
2 1
< 0.1%
4 1
< 0.1%
5 1
< 0.1%
8 1
< 0.1%
9 1
< 0.1%
10 1
< 0.1%
12 1
< 0.1%
13 1
< 0.1%
14 1
< 0.1%
ValueCountFrequency (%)
45641 1
< 0.1%
45639 1
< 0.1%
45634 1
< 0.1%
45633 1
< 0.1%
45631 1
< 0.1%
45630 1
< 0.1%
45626 1
< 0.1%
45625 1
< 0.1%
45623 1
< 0.1%
45617 1
< 0.1%

title
Text

Distinct: 10352
Distinct (%): 93.0%
Missing: 0
Missing (%): 0.0%
Memory size: 87.1 KiB

Length

Max length: 254
Median length: 141
Mean length: 35.749348
Min length: 2

Characters and Unicode

Total characters: 397783
Distinct characters: 296
Distinct categories: 17
Distinct scripts: 8
Distinct blocks: 9
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9865
Unique (%): 88.7%

Sample

1st row: Brown's Star Atlas: Showing All The Bright Stars With Full Instructions How To Find And Use Them For Navigational Purposes And Department Of Trade Examinations.
2nd row: The Tolkien Fan's Medieval Reader
3rd row: Streetcar Suburbs: The Process of Growth in Boston 1870-1900
4th row: Patriots (The Coming Collapse)
5th row: Harry Potter and the Half-Blood Prince (Harry Potter #6)
ValueCountFrequency (%)
the 6692
 
10.1%
of 3336
 
5.0%
and 1653
 
2.5%
a 1335
 
2.0%
1 796
 
1.2%
in 778
 
1.2%
to 698
 
1.1%
588
 
0.9%
2 519
 
0.8%
3 399
 
0.6%
Other values (12076) 49535
74.7%

Most occurring characters

ValueCountFrequency (%)
58892
14.8%
e 36631
 
9.2%
o 23592
 
5.9%
a 22308
 
5.6%
i 20615
 
5.2%
r 20216
 
5.1%
n 20032
 
5.0%
t 19155
 
4.8%
s 16700
 
4.2%
h 13698
 
3.4%
Other values (286) 145944
36.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 265634
66.8%
Space Separator 58892
 
14.8%
Uppercase Letter 52413
 
13.2%
Other Punctuation 8579
 
2.2%
Decimal Number 5487
 
1.4%
Close Punctuation 2765
 
0.7%
Open Punctuation 2764
 
0.7%
Dash Punctuation 808
 
0.2%
Other Letter 373
 
0.1%
Math Symbol 27
 
< 0.1%
Other values (7) 41
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
16
 
4.3%
13
 
3.5%
13
 
3.5%
13
 
3.5%
13
 
3.5%
13
 
3.5%
13
 
3.5%
13
 
3.5%
11
 
2.9%
11
 
2.9%
Other values (144) 244
65.4%
Lowercase Letter
ValueCountFrequency (%)
e 36631
13.8%
o 23592
 
8.9%
a 22308
 
8.4%
i 20615
 
7.8%
r 20216
 
7.6%
n 20032
 
7.5%
t 19155
 
7.2%
s 16700
 
6.3%
h 13698
 
5.2%
l 12851
 
4.8%
Other values (48) 59836
22.5%
Uppercase Letter
ValueCountFrequency (%)
T 7023
 
13.4%
S 4510
 
8.6%
A 3662
 
7.0%
C 3562
 
6.8%
M 3165
 
6.0%
B 2677
 
5.1%
W 2612
 
5.0%
P 2599
 
5.0%
L 2507
 
4.8%
D 2490
 
4.8%
Other values (23) 17606
33.6%
Other Punctuation
ValueCountFrequency (%)
: 3025
35.3%
# 2431
28.3%
' 1397
16.3%
. 709
 
8.3%
/ 414
 
4.8%
& 258
 
3.0%
! 135
 
1.6%
? 70
 
0.8%
; 59
 
0.7%
" 50
 
0.6%
Other values (8) 31
 
0.4%
Decimal Number
ValueCountFrequency (%)
1 1724
31.4%
2 855
15.6%
3 645
 
11.8%
4 415
 
7.6%
0 396
 
7.2%
9 371
 
6.8%
5 347
 
6.3%
6 282
 
5.1%
8 233
 
4.2%
7 219
 
4.0%
Dash Punctuation
ValueCountFrequency (%)
- 773
95.7%
18
 
2.2%
15
 
1.9%
2
 
0.2%
Nonspacing Mark
ValueCountFrequency (%)
́ 3
50.0%
̈ 2
33.3%
̌ 1
 
16.7%
Close Punctuation
ValueCountFrequency (%)
) 2756
99.7%
] 9
 
0.3%
Open Punctuation
ValueCountFrequency (%)
( 2755
99.7%
[ 9
 
0.3%
Math Symbol
ValueCountFrequency (%)
+ 18
66.7%
= 9
33.3%
Final Punctuation
ValueCountFrequency (%)
12
92.3%
1
 
7.7%
Other Number
ValueCountFrequency (%)
½ 11
91.7%
² 1
 
8.3%
Initial Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%
Space Separator
ValueCountFrequency (%)
58892
100.0%
Modifier Letter
ValueCountFrequency (%)
3
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 3
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 318031
80.0%
Common 79357
 
19.9%
Han 232
 
0.1%
Katakana 72
 
< 0.1%
Hiragana 61
 
< 0.1%
Cyrillic 16
 
< 0.1%
Arabic 8
 
< 0.1%
Inherited 6
 
< 0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
13
 
5.6%
13
 
5.6%
13
 
5.6%
13
 
5.6%
13
 
5.6%
13
 
5.6%
13
 
5.6%
11
 
4.7%
8
 
3.4%
8
 
3.4%
Other values (90) 114
49.1%
Latin
ValueCountFrequency (%)
e 36631
 
11.5%
o 23592
 
7.4%
a 22308
 
7.0%
i 20615
 
6.5%
r 20216
 
6.4%
n 20032
 
6.3%
t 19155
 
6.0%
s 16700
 
5.3%
h 13698
 
4.3%
l 12851
 
4.0%
Other values (73) 112233
35.3%
Common
ValueCountFrequency (%)
58892
74.2%
: 3025
 
3.8%
) 2756
 
3.5%
( 2755
 
3.5%
# 2431
 
3.1%
1 1724
 
2.2%
' 1397
 
1.8%
2 855
 
1.1%
- 773
 
1.0%
. 709
 
0.9%
Other values (38) 4040
 
5.1%
Hiragana
ValueCountFrequency (%)
16
26.2%
5
 
8.2%
3
 
4.9%
3
 
4.9%
3
 
4.9%
3
 
4.9%
3
 
4.9%
3
 
4.9%
2
 
3.3%
2
 
3.3%
Other values (14) 18
29.5%
Katakana
ValueCountFrequency (%)
11
15.3%
11
15.3%
11
15.3%
5
 
6.9%
5
 
6.9%
5
 
6.9%
2
 
2.8%
2
 
2.8%
2
 
2.8%
2
 
2.8%
Other values (14) 16
22.2%
Cyrillic
ValueCountFrequency (%)
а 4
25.0%
р 3
18.8%
М 2
12.5%
т 2
12.5%
и 2
12.5%
с 1
 
6.2%
е 1
 
6.2%
г 1
 
6.2%
Arabic
ValueCountFrequency (%)
م 2
25.0%
ل 2
25.0%
ح 1
12.5%
ا 1
12.5%
ن 1
12.5%
د 1
12.5%
Inherited
ValueCountFrequency (%)
́ 3
50.0%
̈ 2
33.3%
̌ 1
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 397016
99.8%
None 319
 
0.1%
CJK 232
 
0.1%
Katakana 75
 
< 0.1%
Hiragana 61
 
< 0.1%
Punctuation 50
 
< 0.1%
Cyrillic 16
 
< 0.1%
Arabic 8
 
< 0.1%
Diacriticals 6
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
58892
14.8%
e 36631
 
9.2%
o 23592
 
5.9%
a 22308
 
5.6%
i 20615
 
5.2%
r 20216
 
5.1%
n 20032
 
5.0%
t 19155
 
4.8%
s 16700
 
4.2%
h 13698
 
3.5%
Other values (75) 145177
36.6%
None
ValueCountFrequency (%)
é 62
19.4%
á 35
11.0%
ó 31
 
9.7%
í 28
 
8.8%
ä 19
 
6.0%
ü 17
 
5.3%
ñ 14
 
4.4%
½ 11
 
3.4%
11
 
3.4%
è 11
 
3.4%
Other values (28) 80
25.1%
Punctuation
ValueCountFrequency (%)
18
36.0%
15
30.0%
12
24.0%
2
 
4.0%
1
 
2.0%
1
 
2.0%
1
 
2.0%
Hiragana
ValueCountFrequency (%)
16
26.2%
5
 
8.2%
3
 
4.9%
3
 
4.9%
3
 
4.9%
3
 
4.9%
3
 
4.9%
3
 
4.9%
2
 
3.3%
2
 
3.3%
Other values (14) 18
29.5%
CJK
ValueCountFrequency (%)
13
 
5.6%
13
 
5.6%
13
 
5.6%
13
 
5.6%
13
 
5.6%
13
 
5.6%
13
 
5.6%
11
 
4.7%
8
 
3.4%
8
 
3.4%
Other values (90) 114
49.1%
Katakana
ValueCountFrequency (%)
11
14.7%
11
14.7%
11
14.7%
5
 
6.7%
5
 
6.7%
5
 
6.7%
3
 
4.0%
2
 
2.7%
2
 
2.7%
2
 
2.7%
Other values (15) 18
24.0%
Cyrillic
ValueCountFrequency (%)
а 4
25.0%
р 3
18.8%
М 2
12.5%
т 2
12.5%
и 2
12.5%
с 1
 
6.2%
е 1
 
6.2%
г 1
 
6.2%
Diacriticals
ValueCountFrequency (%)
́ 3
50.0%
̈ 2
33.3%
̌ 1
 
16.7%
Arabic
ValueCountFrequency (%)
م 2
25.0%
ل 2
25.0%
ح 1
12.5%
ا 1
12.5%
ن 1
12.5%
د 1
12.5%
Distinct: 6643
Distinct (%): 59.7%
Missing: 0
Missing (%): 0.0%
Memory size: 87.1 KiB

Length

Max length: 750
Median length: 372
Mean length: 24.724005
Min length: 3

Characters and Unicode

Total characters: 275104
Distinct characters: 267
Distinct categories: 11
Distinct scripts: 8
Distinct blocks: 8
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5282
Unique (%): 47.5%

Sample

1st row: Brown Son & Ferguson
2nd row: David E. Smith (Turgon of TheOneRing.net one of the founding members of this Tolkien website)/Verlyn Flieger/Turgon (=David E. Smith)
3rd row: Sam Bass Warner Jr./Sam B. Warner
4th row: James Wesley Rawles
5th row: J.K. Rowling/Mary GrandPré
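
Several sample rows pack more than one name into a single value, separated by `/`. Assuming the slash is the only delimiter (an inference from these samples, not something the report states), a minimal sketch for splitting a value into individual names could look like this. `split_authors` is a hypothetical helper name.

```python
def split_authors(raw: str) -> list[str]:
    """Split a slash-delimited value into trimmed, non-empty names."""
    return [name.strip() for name in raw.split("/") if name.strip()]


print(split_authors("J.K. Rowling/Mary GrandPré"))
print(split_authors("James Wesley Rawles"))
```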
ValueCountFrequency (%)
john 279
 
0.8%
william 262
 
0.8%
james 228
 
0.7%
david 203
 
0.6%
a 191
 
0.6%
robert 185
 
0.5%
j 181
 
0.5%
stephen 176
 
0.5%
richard 157
 
0.5%
m 155
 
0.5%
Other values (12644) 31795
94.0%

Most occurring characters

ValueCountFrequency (%)
e 23758
 
8.6%
23425
 
8.5%
a 22547
 
8.2%
r 18119
 
6.6%
n 17426
 
6.3%
i 15677
 
5.7%
o 14415
 
5.2%
l 13273
 
4.8%
s 9822
 
3.6%
t 9527
 
3.5%
Other values (257) 107115
38.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 195795
71.2%
Uppercase Letter 43440
 
15.8%
Space Separator 23425
 
8.5%
Other Punctuation 11849
 
4.3%
Other Letter 378
 
0.1%
Dash Punctuation 200
 
0.1%
Close Punctuation 5
 
< 0.1%
Open Punctuation 5
 
< 0.1%
Decimal Number 4
 
< 0.1%
Format 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
ا 25
 
6.6%
ر 22
 
5.8%
ن 19
 
5.0%
ل 18
 
4.8%
ب 15
 
4.0%
م 13
 
3.4%
ج 12
 
3.2%
9
 
2.4%
ي 9
 
2.4%
8
 
2.1%
Other values (99) 228
60.3%
Lowercase Letter
ValueCountFrequency (%)
e 23758
12.1%
a 22547
11.5%
r 18119
9.3%
n 17426
 
8.9%
i 15677
 
8.0%
o 14415
 
7.4%
l 13273
 
6.8%
s 9822
 
5.0%
t 9527
 
4.9%
h 7628
 
3.9%
Other values (86) 43603
22.3%
Uppercase Letter
ValueCountFrequency (%)
M 3731
 
8.6%
S 3497
 
8.1%
J 3286
 
7.6%
C 3074
 
7.1%
R 2655
 
6.1%
A 2611
 
6.0%
B 2520
 
5.8%
D 2490
 
5.7%
H 2232
 
5.1%
P 2200
 
5.1%
Other values (37) 15144
34.9%
Other Punctuation
ValueCountFrequency (%)
/ 8117
68.5%
. 3577
30.2%
' 148
 
1.2%
! 4
 
< 0.1%
" 2
 
< 0.1%
& 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
9 2
50.0%
1 1
25.0%
2 1
25.0%
Space Separator
ValueCountFrequency (%)
23425
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 200
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5
100.0%
Open Punctuation
ValueCountFrequency (%)
( 5
100.0%
Format
ValueCountFrequency (%)
2
100.0%
Math Symbol
ValueCountFrequency (%)
= 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 239152
86.9%
Common 35489
 
12.9%
Arabic 187
 
0.1%
Han 173
 
0.1%
Greek 53
 
< 0.1%
Cyrillic 30
 
< 0.1%
Hiragana 18
 
< 0.1%
Inherited 2
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 23758
 
9.9%
a 22547
 
9.4%
r 18119
 
7.6%
n 17426
 
7.3%
i 15677
 
6.6%
o 14415
 
6.0%
l 13273
 
5.6%
s 9822
 
4.1%
t 9527
 
4.0%
h 7628
 
3.2%
Other values (88) 86960
36.4%
Han
ValueCountFrequency (%)
9
 
5.2%
8
 
4.6%
8
 
4.6%
8
 
4.6%
8
 
4.6%
8
 
4.6%
8
 
4.6%
8
 
4.6%
8
 
4.6%
8
 
4.6%
Other values (56) 92
53.2%
Arabic
ValueCountFrequency (%)
ا 25
13.4%
ر 22
11.8%
ن 19
10.2%
ل 18
9.6%
ب 15
 
8.0%
م 13
 
7.0%
ج 12
 
6.4%
ي 9
 
4.8%
ی 7
 
3.7%
خ 6
 
3.2%
Other values (19) 41
21.9%
Greek
ValueCountFrequency (%)
ο 7
13.2%
α 5
 
9.4%
υ 4
 
7.5%
ς 4
 
7.5%
λ 4
 
7.5%
ί 3
 
5.7%
ι 3
 
5.7%
κ 2
 
3.8%
τ 2
 
3.8%
π 2
 
3.8%
Other values (17) 17
32.1%
Cyrillic
ValueCountFrequency (%)
а 5
16.7%
л 4
13.3%
и 3
 
10.0%
н 2
 
6.7%
о 2
 
6.7%
в 2
 
6.7%
А 1
 
3.3%
В 1
 
3.3%
ь 1
 
3.3%
е 1
 
3.3%
Other values (8) 8
26.7%
Common
ValueCountFrequency (%)
23425
66.0%
/ 8117
 
22.9%
. 3577
 
10.1%
- 200
 
0.6%
' 148
 
0.4%
) 5
 
< 0.1%
( 5
 
< 0.1%
! 4
 
< 0.1%
9 2
 
< 0.1%
" 2
 
< 0.1%
Other values (4) 4
 
< 0.1%
Hiragana
ValueCountFrequency (%)
2
11.1%
2
11.1%
2
11.1%
2
11.1%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
Other values (4) 4
22.2%
Inherited
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 274080
99.6%
None 613
 
0.2%
Arabic 187
 
0.1%
CJK 173
 
0.1%
Cyrillic 30
 
< 0.1%
Hiragana 18
 
< 0.1%
Punctuation 2
 
< 0.1%
Latin Ext Additional 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 23758
 
8.7%
23425
 
8.5%
a 22547
 
8.2%
r 18119
 
6.6%
n 17426
 
6.4%
i 15677
 
5.7%
o 14415
 
5.3%
l 13273
 
4.8%
s 9822
 
3.6%
t 9527
 
3.5%
Other values (56) 106091
38.7%
None
ValueCountFrequency (%)
é 115
18.8%
í 85
13.9%
á 57
 
9.3%
ō 45
 
7.3%
ó 34
 
5.5%
ë 20
 
3.3%
è 20
 
3.3%
ü 19
 
3.1%
ł 17
 
2.8%
ï 15
 
2.4%
Other values (62) 186
30.3%
Arabic
ValueCountFrequency (%)
ا 25
13.4%
ر 22
11.8%
ن 19
10.2%
ل 18
9.6%
ب 15
 
8.0%
م 13
 
7.0%
ج 12
 
6.4%
ي 9
 
4.8%
ی 7
 
3.7%
خ 6
 
3.2%
Other values (19) 41
21.9%
CJK
ValueCountFrequency (%)
9
 
5.2%
8
 
4.6%
8
 
4.6%
8
 
4.6%
8
 
4.6%
8
 
4.6%
8
 
4.6%
8
 
4.6%
8
 
4.6%
8
 
4.6%
Other values (56) 92
53.2%
Cyrillic
ValueCountFrequency (%)
а 5
16.7%
л 4
13.3%
и 3
 
10.0%
н 2
 
6.7%
о 2
 
6.7%
в 2
 
6.7%
А 1
 
3.3%
В 1
 
3.3%
ь 1
 
3.3%
е 1
 
3.3%
Other values (8) 8
26.7%
Hiragana
ValueCountFrequency (%)
2
11.1%
2
11.1%
2
11.1%
2
11.1%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
Other values (4) 4
22.2%
Punctuation
ValueCountFrequency (%)
2
100.0%
Latin Ext Additional
ValueCountFrequency (%)
1
100.0%

average_rating
Real number (ℝ)

Distinct: 209
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.9336308
Minimum: 0
Maximum: 5
Zeros: 26
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 87.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3.44
Q1: 3.77
Median: 3.96
Q3: 4.135
95-th percentile: 4.38
Maximum: 5
Range: 5
Interquartile range (IQR): 0.365

Descriptive statistics

Standard deviation: 0.35244503
Coefficient of variation (CV): 0.089597893
Kurtosis: 36.721777
Mean: 3.9336308
Median Absolute Deviation (MAD): 0.18
Skewness: -3.6383114
Sum: 43769.51
Variance: 0.1242175
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
4 219
 
2.0%
3.96 195
 
1.8%
4.02 178
 
1.6%
3.94 176
 
1.6%
4.07 172
 
1.5%
3.92 168
 
1.5%
3.93 168
 
1.5%
4.05 168
 
1.5%
3.83 166
 
1.5%
3.89 166
 
1.5%
Other values (199) 9351
84.0%
ValueCountFrequency (%)
0 26
0.2%
1 2
 
< 0.1%
1.67 1
 
< 0.1%
2 6
 
0.1%
2.33 1
 
< 0.1%
2.4 1
 
< 0.1%
2.5 1
 
< 0.1%
2.55 1
 
< 0.1%
2.61 1
 
< 0.1%
2.62 3
 
< 0.1%
ValueCountFrequency (%)
5 22
0.2%
4.91 1
 
< 0.1%
4.88 1
 
< 0.1%
4.86 1
 
< 0.1%
4.83 1
 
< 0.1%
4.82 1
 
< 0.1%
4.8 1
 
< 0.1%
4.78 2
 
< 0.1%
4.76 1
 
< 0.1%
4.75 2
 
< 0.1%

isbn
Text

UNIQUE 

Distinct: 11127
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 87.1 KiB

Length

Max length: 10
Median length: 9
Mean length: 9.2079626
Min length: 7

Characters and Unicode

Total characters: 102457
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11127
Unique (%): 100.0%

Sample

1st row: 851742718
2nd row: 1593600119
3rd row: 674842111
4th row: 156384155X
5th row: 439785960
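
The lengths range from 7 to 10 characters even though an ISBN-10 always has ten. One plausible explanation, stated here as an assumption, is that leading zeros were dropped when the column was loaded. The sketch below pads a value back to ten characters and applies the standard ISBN-10 check (weights 10 down to 1, with `X` standing for 10 in the last position, sum divisible by 11). The function names are illustrative.

```python
def normalize_isbn10(raw: str) -> str:
    """Upper-case the value and left-pad it with zeros to 10 characters."""
    return raw.strip().upper().zfill(10)


def is_valid_isbn10(isbn: str) -> bool:
    """Return True if the 10-character string passes the ISBN-10 checksum."""
    if len(isbn) != 10:
        return False
    total = 0
    for position, char in enumerate(isbn):
        if char == "X" and position == 9:
            value = 10          # 'X' is only allowed as the check character
        elif char in "0123456789":
            value = int(char)
        else:
            return False
        total += (10 - position) * value
    return total % 11 == 0


# The five sample rows all pass once padded.
for sample in ["851742718", "1593600119", "674842111", "156384155X", "439785960"]:
    padded = normalize_isbn10(sample)
    print(padded, is_valid_isbn10(padded))
```

That all five samples validate after padding is consistent with the leading-zero assumption, though it does not prove it for the whole column.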
ValueCountFrequency (%)
851742718 1
 
< 0.1%
517226952 1
 
< 0.1%
674842111 1
 
< 0.1%
156384155x 1
 
< 0.1%
439785960 1
 
< 0.1%
439358078 1
 
< 0.1%
439554896 1
 
< 0.1%
043965548x 1
 
< 0.1%
439682584 1
 
< 0.1%
976540606 1
 
< 0.1%
Other values (11117) 11117
99.9%

Most occurring characters

ValueCountFrequency (%)
1 12578
12.3%
4 11497
11.2%
0 11182
10.9%
5 10545
10.3%
3 10381
10.1%
2 9465
9.2%
7 9347
9.1%
8 9105
8.9%
6 9079
8.9%
9 8293
8.1%
Other values (2) 985
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 101472
99.0%
Uppercase Letter 984
 
1.0%
Lowercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 12578
12.4%
4 11497
11.3%
0 11182
11.0%
5 10545
10.4%
3 10381
10.2%
2 9465
9.3%
7 9347
9.2%
8 9105
9.0%
6 9079
8.9%
9 8293
8.2%
Uppercase Letter
ValueCountFrequency (%)
X 984
100.0%
Lowercase Letter
ValueCountFrequency (%)
x 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 101472
99.0%
Latin 985
 
1.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 12578
12.4%
4 11497
11.3%
0 11182
11.0%
5 10545
10.4%
3 10381
10.2%
2 9465
9.3%
7 9347
9.2%
8 9105
9.0%
6 9079
8.9%
9 8293
8.2%
Latin
ValueCountFrequency (%)
X 984
99.9%
x 1
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 102457
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 12578
12.3%
4 11497
11.2%
0 11182
10.9%
5 10545
10.3%
3 10381
10.1%
2 9465
9.2%
7 9347
9.1%
8 9105
8.9%
6 9079
8.9%
9 8293
8.1%
Other values (2) 985
 
1.0%

isbn13
Real number (ℝ)

SKEWED 

Distinct: 239
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.7598879 × 10^12
Minimum: 8.9870598 × 10^9
Maximum: 9.79001 × 10^12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 87.1 KiB

Quantile statistics

Minimum: 8.9870598 × 10^9
5-th percentile: 9.78006 × 10^12
Q1: 9.78035 × 10^12
Median: 9.78059 × 10^12
Q3: 9.78087 × 10^12
95-th percentile: 9.78193 × 10^12
Maximum: 9.79001 × 10^12
Range: 9.7810229 × 10^12
Interquartile range (IQR): 5.2 × 10^8

Descriptive statistics

Standard deviation: 4.428964 × 10^11
Coefficient of variation (CV): 0.04537925
Kurtosis: 442.6346
Mean: 9.7598879 × 10^12
Median Absolute Deviation (MAD): 2.5 × 10^8
Skewness: -21.070288
Sum: 1.0859827 × 10^17
Variance: 1.9615722 × 10^23
Monotonicity: Not monotonic
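
The strongly negative skewness is what one would expect when almost all values sit near 9.78 × 10^12 and a handful fall far below, around 10^10. The sketch below computes the plain moment-based skewness on a small made-up sample with that shape. The sample is hypothetical, and the report's own γ1 may come from a bias-corrected estimator, so the numbers are not meant to match.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction).

    Assumes at least two distinct values, so that m2 is non-zero.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical sample: 99 values near 9.78e12 plus one value near 9e9.
sample = [9.7803e12, 9.7806e12, 9.7809e12] * 33 + [8.987e9]
print(skewness(sample))           # strongly negative
print(skewness([1.0, 2.0, 3.0]))  # symmetric data gives zero
```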
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
9.78014 × 10^12     662    5.9%
9.78006 × 10^12     654    5.9%
9.78045 × 10^12     455    4.1%
9.78039 × 10^12     389    3.5%
9.78074 × 10^12     374    3.4%
9.78031 × 10^12     349    3.1%
9.78159 × 10^12     344    3.1%
9.78081 × 10^12     342    3.1%
9.78068 × 10^12     308    2.8%
9.78038 × 10^12     300    2.7%
Other values (229)  6950   62.5%
Value                Count  Frequency (%)
8987059752           1      < 0.1%
2.004913 × 10^10     1      < 0.1%
2.375500432 × 10^10  1      < 0.1%
3.44060546 × 10^10   1      < 0.1%
4.908600776 × 10^10  1      < 0.1%
7.399914077 × 10^10  1      < 0.1%
7.399925491 × 10^10  1      < 0.1%
7.399976844 × 10^10  1      < 0.1%
7.399996082 × 10^10  1      < 0.1%
7.609202599 × 10^10  1      < 0.1%
Value            Count  Frequency (%)
9.79001 × 10^12  1      < 0.1%
9.79 × 10^12     1      < 0.1%
9.78988 × 10^12  3      < 0.1%
9.78987 × 10^12  1      < 0.1%
9.78986 × 10^12  8      0.1%
9.78983 × 10^12  1      < 0.1%
9.78981 × 10^12  3      < 0.1%
9.78979 × 10^12  1      < 0.1%
9.78977 × 10^12  1      < 0.1%
9.78972 × 10^12  6      0.1%

language_code
Categorical

IMBALANCE 

Distinct: 27
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 87.1 KiB
eng: 8911
en-US: 1409
spa: 218
en-GB: 214
fre: 144
Other values (22): 231

Length

Max length: 5
Median length: 3
Mean length: 3.2928912
Min length: 2

Characters and Unicode

Total characters: 36640
Distinct characters: 26
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): 0.1%

Sample

1st row: eng
2nd row: eng
3rd row: en-US
4th row: eng
5th row: eng

Common Values

Value              Count  Frequency (%)
eng                8911   80.1%
en-US              1409   12.7%
spa                218    2.0%
en-GB              214    1.9%
fre                144    1.3%
ger                99     0.9%
jpn                46     0.4%
mul                19     0.2%
zho                14     0.1%
grc                11     0.1%
Other values (17)  42     0.4%
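
The 76.6% imbalance figure in the Alerts section can be approximated from these counts. The sketch below assumes the score is defined as one minus the Shannon entropy of the class frequencies, normalized by the log of the number of classes. This definition is an assumption on my part, not something the report states. The report gives only a total of 42 for the 17 remaining codes (10 of which occur once), so the split of the other 32 rows across 7 codes is invented purely for illustration.

```python
import math


def imbalance_score(counts):
    """1 - normalized Shannon entropy: 0 = balanced, close to 1 = one dominant class."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


# The ten counts listed in the table above.
listed = [8911, 1409, 218, 214, 144, 99, 46, 19, 14, 11]
# Tail: ten codes occurring once (from "Unique: 10") plus a HYPOTHETICAL
# split of the remaining 32 rows over 7 codes.
tail = [1] * 10 + [5, 5, 5, 5, 4, 4, 4]

print(round(imbalance_score(listed + tail), 3))
print(imbalance_score([50, 50]))  # perfectly balanced
```

Under these assumptions the score comes out near the reported value, and it is not very sensitive to how the small tail is split.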

Length

Histogram of lengths of the category
ValueCountFrequency (%)
eng 8911
80.1%
en-us 1409
 
12.7%
spa 218
 
2.0%
en-gb 214
 
1.9%
fre 144
 
1.3%
ger 99
 
0.9%
jpn 46
 
0.4%
mul 19
 
0.2%
zho 14
 
0.1%
grc 11
 
0.1%
Other values (17) 42
 
0.4%

Most occurring characters

ValueCountFrequency (%)
e 10791
29.5%
n 10592
28.9%
g 9024
24.6%
- 1630
 
4.4%
U 1409
 
3.8%
S 1409
 
3.8%
p 275
 
0.8%
r 270
 
0.7%
a 231
 
0.6%
s 224
 
0.6%
Other values (16) 785
 
2.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 31750
86.7%
Uppercase Letter 3260
 
8.9%
Dash Punctuation 1630
 
4.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 10791
34.0%
n 10592
33.4%
g 9024
28.4%
p 275
 
0.9%
r 270
 
0.9%
a 231
 
0.7%
s 224
 
0.7%
f 144
 
0.5%
j 46
 
0.1%
l 27
 
0.1%
Other values (9) 126
 
0.4%
Uppercase Letter
ValueCountFrequency (%)
U 1409
43.2%
S 1409
43.2%
G 214
 
6.6%
B 214
 
6.6%
C 7
 
0.2%
A 7
 
0.2%
Dash Punctuation
ValueCountFrequency (%)
- 1630
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 35010
95.6%
Common 1630
 
4.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 10791
30.8%
n 10592
30.3%
g 9024
25.8%
U 1409
 
4.0%
S 1409
 
4.0%
p 275
 
0.8%
r 270
 
0.8%
a 231
 
0.7%
s 224
 
0.6%
G 214
 
0.6%
Other values (15) 571
 
1.6%
Common
ValueCountFrequency (%)
- 1630
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36640
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 10791
29.5%
n 10592
28.9%
g 9024
24.6%
- 1630
 
4.4%
U 1409
 
3.8%
S 1409
 
3.8%
p 275
 
0.8%
r 270
 
0.7%
a 231
 
0.6%
s 224
 
0.6%
Other values (16) 785
 
2.1%

num_pages
Real number (ℝ)

Distinct: 997
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 336.37692
Minimum: 0
Maximum: 6576
Zeros: 76
Zeros (%): 0.7%
Negative: 0
Negative (%): 0.0%
Memory size: 87.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 48
Q1: 192
Median: 299
Q3: 416
95-th percentile: 752
Maximum: 6576
Range: 6576
Interquartile range (IQR): 224

Descriptive statistics

Standard deviation: 241.12731
Coefficient of variation (CV): 0.71683665
Kurtosis: 62.422129
Mean: 336.37692
Median Absolute Deviation (MAD): 107
Skewness: 4.2717867
Sum: 3742866
Variance: 58142.377
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
288 230
 
2.1%
192 221
 
2.0%
320 218
 
2.0%
256 207
 
1.9%
352 202
 
1.8%
224 198
 
1.8%
208 178
 
1.6%
304 177
 
1.6%
240 173
 
1.6%
384 172
 
1.5%
Other values (987) 9151
82.2%
ValueCountFrequency (%)
0 76
0.7%
1 11
 
0.1%
2 15
 
0.1%
3 19
 
0.2%
4 11
 
0.1%
5 16
 
0.1%
6 20
 
0.2%
7 6
 
0.1%
8 10
 
0.1%
9 11
 
0.1%
ValueCountFrequency (%)
6576 1
< 0.1%
4736 1
< 0.1%
3400 1
< 0.1%
3342 1
< 0.1%
3020 1
< 0.1%
2751 1
< 0.1%
2690 1
< 0.1%
2480 1
< 0.1%
2264 1
< 0.1%
2198 1
< 0.1%

ratings_count
Real number (ℝ)

HIGH CORRELATION 

Distinct: 5294
Distinct (%): 47.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17936.409
Minimum: 0
Maximum: 4597666
Zeros: 81
Zeros (%): 0.7%
Negative: 0
Negative (%): 0.0%
Memory size: 87.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 8
Q1: 104
Median: 745
Q3: 4993.5
95-th percentile: 61096
Maximum: 4597666
Range: 4597666
Interquartile range (IQR): 4889.5

Descriptive statistics

Standard deviation: 112479.44
Coefficient of variation (CV): 6.2710123
Kurtosis: 442.42766
Mean: 17936.409
Median Absolute Deviation (MAD): 727
Skewness: 17.697061
Sum: 1.9957842 × 10^8
Variance: 1.2651625 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
3 82
 
0.7%
0 81
 
0.7%
1 76
 
0.7%
4 71
 
0.6%
2 71
 
0.6%
5 61
 
0.5%
9 60
 
0.5%
8 59
 
0.5%
6 57
 
0.5%
7 56
 
0.5%
Other values (5284) 10453
93.9%
ValueCountFrequency (%)
0 81
0.7%
1 76
0.7%
2 71
0.6%
3 82
0.7%
4 71
0.6%
5 61
0.5%
6 57
0.5%
7 56
0.5%
8 59
0.5%
9 60
0.5%
ValueCountFrequency (%)
4597666 1
< 0.1%
2530894 1
< 0.1%
2457092 1
< 0.1%
2418736 1
< 0.1%
2339585 1
< 0.1%
2293963 1
< 0.1%
2153167 1
< 0.1%
2128944 1
< 0.1%
2111750 1
< 0.1%
2095690 1
< 0.1%

text_reviews_count
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 1822
Distinct (%): 16.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 541.8545
Minimum: 0
Maximum: 94265
Zeros: 625
Zeros (%): 5.6%
Negative: 0
Negative (%): 0.0%
Memory size: 87.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 9
Median: 46
Q3: 237.5
95-th percentile: 2158.7
Maximum: 94265
Range: 94265
Interquartile range (IQR): 228.5

Descriptive statistics

Standard deviation: 2576.1766
Coefficient of variation (CV): 4.7543697
Kurtosis: 396.701
Mean: 541.8545
Median Absolute Deviation (MAD): 44
Skewness: 16.177845
Sum: 6029215
Variance: 6636685.9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 625
 
5.6%
1 458
 
4.1%
2 354
 
3.2%
3 263
 
2.4%
4 249
 
2.2%
5 223
 
2.0%
6 200
 
1.8%
7 180
 
1.6%
9 164
 
1.5%
8 162
 
1.5%
Other values (1812) 8249
74.1%
ValueCountFrequency (%)
0 625
5.6%
1 458
4.1%
2 354
3.2%
3 263
2.4%
4 249
 
2.2%
5 223
 
2.0%
6 200
 
1.8%
7 180
 
1.6%
8 162
 
1.5%
9 164
 
1.5%
ValueCountFrequency (%)
94265 1
< 0.1%
86881 1
< 0.1%
56604 1
< 0.1%
55843 1
< 0.1%
52759 1
< 0.1%
47951 1
< 0.1%
47620 1
< 0.1%
46176 1
< 0.1%
43499 1
< 0.1%
36325 1
< 0.1%
Distinct: 3679
Distinct (%): 33.1%
Missing: 0
Missing (%): 0.0%
Memory size: 87.1 KiB

Length

Max length: 10
Median length: 10
Mean length: 9.6969534
Min length: 9

Characters and Unicode

Total characters: 107898
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2022
Unique (%): 18.2%

Sample

1st row: 05-01-1977
2nd row: 04-06-2004
3rd row: 4/20/2004
4th row: 1/15/1999
5th row: 9/16/2006
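
The sample rows mix two layouts, one with dashes and zero padding and one with slashes and no padding, which is likely why the column was profiled as text. A minimal parsing sketch follows. It assumes both layouts are month-first: the slash form clearly is (4/20/2004), but for the dash form this is a guess. `parse_publication_date` is a hypothetical helper name.

```python
from datetime import date, datetime

# Both formats are ASSUMED to be month-day-year.
DATE_FORMATS = ("%m-%d-%Y", "%m/%d/%Y")


def parse_publication_date(raw: str) -> date:
    """Try each known layout in turn and return the first successful parse."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date format: {raw!r}")


for sample in ["05-01-1977", "04-06-2004", "4/20/2004", "1/15/1999", "9/16/2006"]:
    print(sample, "->", parse_publication_date(sample).isoformat())
```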
ValueCountFrequency (%)
10-01-2005 56
 
0.5%
11-01-2005 53
 
0.5%
09-01-2006 51
 
0.5%
10-01-2006 48
 
0.4%
11-01-2006 40
 
0.4%
07-01-2004 39
 
0.4%
08-01-2006 39
 
0.4%
08-01-2005 37
 
0.3%
07-01-2003 37
 
0.3%
10-01-2004 37
 
0.3%
Other values (3669) 10690
96.1%

Most occurring characters

ValueCountFrequency (%)
0 28827
26.7%
1 15692
14.5%
2 13218
12.3%
- 13144
12.2%
/ 9110
 
8.4%
9 8403
 
7.8%
6 3780
 
3.5%
5 3683
 
3.4%
3 3359
 
3.1%
4 3084
 
2.9%
Other values (2) 5598
 
5.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 85644
79.4%
Dash Punctuation 13144
 
12.2%
Other Punctuation 9110
 
8.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 28827
33.7%
1 15692
18.3%
2 13218
15.4%
9 8403
 
9.8%
6 3780
 
4.4%
5 3683
 
4.3%
3 3359
 
3.9%
4 3084
 
3.6%
7 2826
 
3.3%
8 2772
 
3.2%
Dash Punctuation
ValueCountFrequency (%)
- 13144
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 9110
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 107898
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 28827
26.7%
1 15692
14.5%
2 13218
12.3%
- 13144
12.2%
/ 9110
 
8.4%
9 8403
 
7.8%
6 3780
 
3.5%
5 3683
 
3.4%
3 3359
 
3.1%
4 3084
 
2.9%
Other values (2) 5598
 
5.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 107898
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 28827
26.7%
1 15692
14.5%
2 13218
12.3%
- 13144
12.2%
/ 9110
 
8.4%
9 8403
 
7.8%
6 3780
 
3.5%
5 3683
 
3.4%
3 3359
 
3.1%
4 3084
 
2.9%
Other values (2) 5598
 
5.2%
Distinct: 2292
Distinct (%): 20.6%
Missing: 0
Missing (%): 0.0%
Memory size: 87.1 KiB

Length

Max length: 67
Median length: 52
Mean length: 15.180372
Min length: 2

Characters and Unicode

Total characters: 168912
Distinct characters: 140
Distinct categories: 11
Distinct scripts: 6
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1296
Unique (%): 11.6%

Sample

1st row: Brown Son & Ferguson Ltd.
2nd row: Cold Spring Press
3rd row: Harvard University Press
4th row: Huntington House Publishers
5th row: Scholastic Inc.
ValueCountFrequency (%)
books 2302
 
9.3%
press 1316
 
5.3%
penguin 598
 
2.4%
university 552
 
2.2%
publishing 511
 
2.1%
vintage 409
 
1.6%
353
 
1.4%
classics 344
 
1.4%
company 331
 
1.3%
house 320
 
1.3%
Other values (2016) 17765
71.6%

Most occurring characters

ValueCountFrequency (%)
14220
 
8.4%
o 12967
 
7.7%
e 12949
 
7.7%
s 11914
 
7.1%
r 11785
 
7.0%
i 10689
 
6.3%
a 10476
 
6.2%
n 10462
 
6.2%
l 6209
 
3.7%
t 5820
 
3.4%
Other values (130) 61421
36.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 126142
74.7%
Uppercase Letter 26043
 
15.4%
Space Separator 14220
 
8.4%
Other Punctuation 1640
 
1.0%
Other Letter 235
 
0.1%
Open Punctuation 234
 
0.1%
Close Punctuation 234
 
0.1%
Dash Punctuation 101
 
0.1%
Decimal Number 56
 
< 0.1%
Final Punctuation 4
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
18
 
7.7%
16
 
6.8%
16
 
6.8%
16
 
6.8%
13
 
5.5%
12
 
5.1%
10
 
4.3%
10
 
4.3%
8
 
3.4%
8
 
3.4%
Other values (34) 108
46.0%
Lowercase Letter
ValueCountFrequency (%)
o 12967
10.3%
e 12949
10.3%
s 11914
9.4%
r 11785
9.3%
i 10689
 
8.5%
a 10476
 
8.3%
n 10462
 
8.3%
l 6209
 
4.9%
t 5820
 
4.6%
u 4146
 
3.3%
Other values (31) 28725
22.8%
Uppercase Letter
ValueCountFrequency (%)
P 4241
16.3%
B 3806
14.6%
C 2151
 
8.3%
S 1724
 
6.6%
H 1713
 
6.6%
A 1365
 
5.2%
M 1283
 
4.9%
L 1078
 
4.1%
D 931
 
3.6%
R 851
 
3.3%
Other values (18) 6900
26.5%
Decimal Number
ValueCountFrequency (%)
1 14
25.0%
3 13
23.2%
0 11
19.6%
2 5
 
8.9%
8 5
 
8.9%
4 4
 
7.1%
7 2
 
3.6%
9 1
 
1.8%
6 1
 
1.8%
Other Punctuation
ValueCountFrequency (%)
. 833
50.8%
' 330
 
20.1%
& 329
 
20.1%
/ 133
 
8.1%
: 7
 
0.4%
; 4
 
0.2%
" 2
 
0.1%
! 2
 
0.1%
Nonspacing Mark
ValueCountFrequency (%)
̄ 1
33.3%
̃ 1
33.3%
́ 1
33.3%
Open Punctuation
ValueCountFrequency (%)
( 233
99.6%
[ 1
 
0.4%
Close Punctuation
ValueCountFrequency (%)
) 233
99.6%
] 1
 
0.4%
Space Separator
ValueCountFrequency (%)
14220
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 101
100.0%
Final Punctuation
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 152180
90.1%
Common 16489
 
9.8%
Han 186
 
0.1%
Katakana 49
 
< 0.1%
Cyrillic 5
 
< 0.1%
Inherited 3
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 12967
 
8.5%
e 12949
 
8.5%
s 11914
 
7.8%
r 11785
 
7.7%
i 10689
 
7.0%
a 10476
 
6.9%
n 10462
 
6.9%
l 6209
 
4.1%
t 5820
 
3.8%
P 4241
 
2.8%
Other values (54) 54668
35.9%
Han
ValueCountFrequency (%)
18
 
9.7%
16
 
8.6%
16
 
8.6%
16
 
8.6%
13
 
7.0%
12
 
6.5%
8
 
4.3%
8
 
4.3%
6
 
3.2%
6
 
3.2%
Other values (24) 67
36.0%
Common
ValueCountFrequency (%)
14220
86.2%
. 833
 
5.1%
' 330
 
2.0%
& 329
 
2.0%
( 233
 
1.4%
) 233
 
1.4%
/ 133
 
0.8%
- 101
 
0.6%
1 14
 
0.1%
3 13
 
0.1%
Other values (14) 50
 
0.3%
Katakana
ValueCountFrequency (%)
10
20.4%
10
20.4%
6
12.2%
5
10.2%
5
10.2%
5
10.2%
5
10.2%
1
 
2.0%
1
 
2.0%
1
 
2.0%
Cyrillic
ValueCountFrequency (%)
с 1
20.0%
о 1
20.0%
м 1
20.0%
Э 1
20.0%
к 1
20.0%
Inherited
ValueCountFrequency (%)
̄ 1
33.3%
̃ 1
33.3%
́ 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 168602
99.8%
CJK 186
 
0.1%
None 63
 
< 0.1%
Katakana 49
 
< 0.1%
Cyrillic 5
 
< 0.1%
Punctuation 4
 
< 0.1%
Diacriticals 3
 
< 0.1%
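
The block table above can be approximated by checking each character's code point against Unicode block ranges. The ranges below are the standard Unicode block ranges. Mapping the report's short labels (ASCII, CJK, Punctuation, Diacriticals) to those blocks is an assumption, as is returning "None" for every code point outside the listed ranges. BLOCKS, block_of, and count_blocks are hypothetical names, and the input strings are example values.

```python
from collections import Counter

# (first code point, last code point, short label).
# Assumption: the report's labels correspond to these standard blocks.
BLOCKS = [
    (0x0000, 0x007F, "ASCII"),         # Basic Latin
    (0x0300, 0x036F, "Diacriticals"),  # Combining Diacritical Marks
    (0x0400, 0x04FF, "Cyrillic"),
    (0x2000, 0x206F, "Punctuation"),   # General Punctuation
    (0x30A0, 0x30FF, "Katakana"),
    (0x4E00, 0x9FFF, "CJK"),           # CJK Unified Ideographs
]


def block_of(ch):
    """Return the short block label for one character, or "None" if unlisted."""
    code_point = ord(ch)
    for first, last, label in BLOCKS:
        if first <= code_point <= last:
            return label
    return "None"


def count_blocks(strings):
    """Count characters per block label over all strings."""
    return Counter(block_of(ch) for text in strings for ch in text)


# Example strings for illustration; escapes keep the literals unambiguous.
sample = [
    "Penguin Books",
    "Editorial Presen\u00e7a",          # precomposed c-cedilla, outside the listed ranges
    "\u042d\u043a\u0441\u043c\u043e",   # five Cyrillic letters
]
counts = count_blocks(sample)
print(counts)
```

Under this assumption, accented Latin letters such as é or ü land in "None" only because Latin-1 Supplement and Latin Extended-A are not in the lookup list.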

Most frequent character per block

ASCII
ValueCountFrequency (%)
14220
 
8.4%
o 12967
 
7.7%
e 12949
 
7.7%
s 11914
 
7.1%
r 11785
 
7.0%
i 10689
 
6.3%
a 10476
 
6.2%
n 10462
 
6.2%
l 6209
 
3.7%
t 5820
 
3.5%
Other values (65) 61111
36.2%
None
ValueCountFrequency (%)
é 27
42.9%
ü 8
 
12.7%
ç 6
 
9.5%
ë 5
 
7.9%
É 4
 
6.3%
ı 3
 
4.8%
í 3
 
4.8%
ó 2
 
3.2%
ö 2
 
3.2%
ñ 1
 
1.6%
Other values (2) 2
 
3.2%
CJK
ValueCountFrequency (%)
18
 
9.7%
16
 
8.6%
16
 
8.6%
16
 
8.6%
13
 
7.0%
12
 
6.5%
8
 
4.3%
8
 
4.3%
6
 
3.2%
6
 
3.2%
Other values (24) 67
36.0%
Katakana
ValueCountFrequency (%)
10
20.4%
10
20.4%
6
12.2%
5
10.2%
5
10.2%
5
10.2%
5
10.2%
1
 
2.0%
1
 
2.0%
1
 
2.0%
Punctuation
ValueCountFrequency (%)
4
100.0%
Cyrillic
ValueCountFrequency (%)
с 1
20.0%
о 1
20.0%
м 1
20.0%
Э 1
20.0%
к 1
20.0%
Diacriticals
ValueCountFrequency (%)
̄ 1
33.3%
̃ 1
33.3%
́ 1
33.3%

Interactions

[Figures: 36 interaction plots; images not preserved]

Correlations

                    average_rating  bookID  isbn13  language_code  num_pages  ratings_count  text_reviews_count
average_rating               1.000  -0.037   0.054          0.101      0.110          0.087               0.033
bookID                      -0.037   1.000   0.041          0.050     -0.010         -0.099              -0.112
isbn13                       0.054   0.041   1.000          0.000     -0.137         -0.252              -0.264
language_code                0.101   0.050   0.000          1.000      0.016         -0.048              -0.054
num_pages                    0.110  -0.010  -0.137          0.016      1.000          0.185               0.168
ratings_count                0.087  -0.099  -0.252         -0.048      0.185          1.000               0.959
text_reviews_count           0.033  -0.112  -0.264         -0.054      0.168          0.959               1.000
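
The matrix above does not state which coefficient it uses. As one possibility for a numeric pair such as ratings_count and text_reviews_count (0.959 in the matrix), a Spearman rank correlation can be sketched with the standard library alone. This is an illustration under that assumption, not the report's method. The function names are hypothetical and the input values are toy numbers, not dataset rows.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 0-based positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values for illustration only.
ratings_count = [10, 20, 30, 40, 50]
text_reviews_count = [1, 3, 2, 5, 4]
rho = spearman(ratings_count, text_reviews_count)
print(round(rho, 3))
```

Working on ranks instead of raw values keeps a few extremely large counts from dominating the coefficient, which is relevant for heavy-tailed count columns.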

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
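
The per-column nullity count behind such a chart can be sketched as below. This is a minimal illustration, assuming records held as dictionaries with None marking a missing cell. The function name nullity_by_column is hypothetical and the two records are invented for the example. The report's overview lists 0 missing cells for this dataset, so on the real data every count would be 0.

```python
def nullity_by_column(rows, columns):
    """Return {column: number of missing cells}, treating None as missing."""
    return {
        col: sum(1 for row in rows if row.get(col) is None)
        for col in columns
    }


# Invented records for illustration only; the second one lacks a publisher.
rows = [
    {"bookID": 1, "publisher": "Example Press"},
    {"bookID": 2, "publisher": None},
]
missing = nullity_by_column(rows, ["bookID", "publisher"])
total_missing = sum(missing.values())
print(missing, total_missing)
```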

Sample

bookID  title  authors  average_rating  isbn  isbn13  language_code  num_pages  ratings_count  text_reviews_count  publication_date  publisher
034889Brown's Star Atlas: Showing All The Bright Stars With Full Instructions How To Find And Use Them For Navigational Purposes And Department Of Trade Examinations.Brown Son & Ferguson0.008517427189,780,850,000,000.00eng490005-01-1977Brown Son & Ferguson Ltd.
116914The Tolkien Fan's Medieval ReaderDavid E. Smith (Turgon of TheOneRing.net one of the founding members of this Tolkien website)/Verlyn Flieger/Turgon (=David E. Smith)3.5815936001199,781,590,000,000.00eng40026404-06-2004Cold Spring Press
212224Streetcar Suburbs: The Process of Growth in Boston 1870-1900Sam Bass Warner Jr./Sam B. Warner3.586748421119,780,670,000,000.00en-US2366164/20/2004Harvard University Press
322128Patriots (The Coming Collapse)James Wesley Rawles3.63156384155X9,781,560,000,000.00eng3423841/15/1999Huntington House Publishers
41Harry Potter and the Half-Blood Prince (Harry Potter #6)J.K. Rowling/Mary GrandPré4.574397859609,780,440,000,000.00eng6522095690275919/16/2006Scholastic Inc.
52Harry Potter and the Order of the Phoenix (Harry Potter #5)J.K. Rowling/Mary GrandPré4.494393580789,780,440,000,000.00eng87021531672922109-01-2004Scholastic Inc.
64Harry Potter and the Chamber of Secrets (Harry Potter #2)J.K. Rowling4.424395548969,780,440,000,000.00eng352633324411-01-2003Scholastic
75Harry Potter and the Prisoner of Azkaban (Harry Potter #3)J.K. Rowling/Mary GrandPré4.56043965548X9,780,440,000,000.00eng43523395853632505-01-2004Scholastic Inc.
88Harry Potter Boxed Set Books 1-5 (Harry Potter #1-5)J.K. Rowling/Mary GrandPré4.784396825849,780,440,000,000.00eng2690414281649/13/2004Scholastic
99Unauthorized Harry Potter Book Seven News: "Half-Blood Prince" Analysis and SpeculationW. Frederick Zimmerman3.749765406069,780,980,000,000.00en-US1521914/26/2005Nimble Books
bookID  title  authors  average_rating  isbn  isbn13  language_code  num_pages  ratings_count  text_reviews_count  publication_date  publisher
1111745617O Cavalo e o Seu Rapaz (As Crónicas de Nárnia #3)C.S. Lewis/Pauline Baynes/Ana Falcão Bastos3.9297223305519,789,720,000,000.00por160207168/15/2003Editorial Presença
1111845623O Sobrinho do Mágico (As Crónicas de Nárnia #1)C.S. Lewis/Pauline Baynes/Ana Falcão Bastos4.0497223299879,789,720,000,000.00por1473963704-08-2003Editorial Presença
1111945625A Viagem do Caminheiro da Alvorada (As Crónicas de Nárnia #5)C.S. Lewis/Pauline Baynes/Ana Falcão Bastos4.0997223313299,789,720,000,000.00por1761611409-01-2004Editorial Presença
1112045626O Príncipe Caspian (As Crónicas de Nárnia #4)C.S. Lewis/Pauline Baynes/Ana Falcão Bastos3.9797223309779,789,720,000,000.00por1602151110-11-2003Editorial Presença
1112145630Whores for GloriaWilliam T. Vollmann3.691402315799,780,140,000,000.00en-US16093211102-01-1994Penguin Books
1112245631Expelled from Eden: A William T. Vollmann ReaderWilliam T. Vollmann/Larry McCaffery/Michael Hemmingson4.0615602544169,781,560,000,000.00eng5121562012/21/2004Da Capo Press
1112345633You Bright and Risen AngelsWilliam T. Vollmann4.081401108799,780,140,000,000.00eng6357835612-01-1988Penguin Books
1112445634The Ice-Shirt (Seven Dreams #1)William T. Vollmann3.961401319659,780,140,000,000.00eng4158209508-01-1993Penguin Books
1112545639Poor PeopleWilliam T. Vollmann3.72608788279,780,060,000,000.00eng4347691392/27/2007Ecco
1112645641Las aventuras de Tom SawyerMark Twain3.9184976469839,788,500,000,000.00spa272113125/28/2006Edimat Libros